K-Space at TRECVid 2008
Authors
Abstract
In this paper we describe K-Space's participation in TRECVid 2008 in the interactive search task. For 2008 the K-Space group performed one of the largest interactive video information retrieval experiments conducted in a laboratory setting. Three institutions participated in a multi-site, multi-system experiment. In total 36 users took part, 12 each from Dublin City University (DCU, Ireland), the University of Glasgow (GU, Scotland) and Centrum Wiskunde & Informatica (CWI, the Netherlands). Three user interfaces were developed: two from DCU, which were also used in 2007, and one from GU. All interfaces leveraged the same search service. Using a Latin squares arrangement, each user conducted 12 topics, yielding 6 runs per site, 18 in total. We officially submitted 3 of these runs to NIST for evaluation, together with an additional expert run using a fourth system. Our submitted runs performed around the median. In this paper we present an overview of the search system used, the experimental setup and a preliminary analysis of our results.

1 Overview of K-Space

K-Space is a European Network of Excellence (NoE) in semantic inference for semi-automatic annotation and retrieval of multimedia content [1], now in the second year of its three-year funding. It is coordinated by Queen Mary University of London (QMUL), and the partner responsible for coordinating the K-Space participation in TRECVid is Dublin City University. K-Space is focused on the research and convergence of three themes: content-based multimedia analysis, knowledge extraction and semantic multimedia.

2 Search Experiment Overview

As stated in the abstract, our participation in TRECVid 2008 interactive search was to conduct one of the largest interactive video information retrieval experiments in a laboratory setting to date. Our motivation was to conduct an investigation into interactive multimedia retrieval that sought to tease apart as many influencing variables as possible. This paper primarily details the systems used in the experiment, our experimental parameters and an initial examination of the results.

3 Common Search Engine

The three user interfaces developed for the search experiment leveraged a common search engine. Components of this common engine leveraged previous content-analysis techniques of K-Space partners that were used in TRECVid 2007. The following briefly details these components; a more complete explanation can be found in last year's TRECVid publication [23].

As no common keyframe set was released by TRECVid, we extracted our own set of keyframes. Our keyframe selection strategy was to extract every second I-frame from each shot. We extracted low-level visual features from the keyframes using several feature descriptors based on the MPEG-7 XM. These descriptors were implemented as part of the aceToolbox, a toolbox of low-level audio and visual analysis tools developed as part of our participation in the EU aceMedia project. We made use of six different global visual descriptors: Colour Layout, Colour Moments, Colour Structure, Homogeneous Texture, Edge Histogram and Scalable Colour. A complete description of each of these descriptors can be found in [14]. We also segmented the keyframes and extracted region-based features. This processing was made available to all K-Space partners; further details are available in last year's paper [23].
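To make the keyframe selection rule concrete, the following is a minimal sketch of "every second I-frame per shot". The paper does not say which tool was used to locate I-frames; this sketch assumes ffprobe is available and that shot boundaries are already known as (start, end) timestamp pairs. The function names are illustrative.

```python
import subprocess

def iframe_times(video_path):
    """List I-frame timestamps (seconds) via ffprobe.

    Parses one CSV line per video frame: '<pts_time>,<pict_type>'.
    Assumes ffprobe (from FFmpeg) is installed.
    """
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "frame=pts_time,pict_type",
         "-of", "csv=p=0", video_path],
        capture_output=True, text=True, check=True).stdout
    times = []
    for line in out.splitlines():
        parts = line.split(",")
        if len(parts) >= 2 and parts[1].strip() == "I" and parts[0] != "N/A":
            times.append(float(parts[0]))
    return times

def select_keyframes(iframes, shots):
    """Pick every second I-frame within each (start, end) shot interval."""
    keyframes = []
    for start, end in shots:
        in_shot = [t for t in iframes if start <= t < end]
        keyframes.extend(in_shot[::2])  # every second I-frame in the shot
    return keyframes
```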
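The descriptors themselves were extracted with the aceToolbox; as a rough illustration of what one of these global descriptors computes, here is a simplified stand-in for the Edge Histogram descriptor. This is not the MPEG-7 XM reference implementation nor the aceToolbox code: it just splits the image into a 4x4 grid and builds a gradient-orientation histogram per block, in the spirit of the descriptor.

```python
import cv2
import numpy as np

def edge_histogram(image_path, grid=4, bins=5):
    """Simplified edge-orientation histogram: grid*grid blocks, each with a
    bins-bin orientation histogram weighted by gradient magnitude, giving a
    grid*grid*bins-dimensional global descriptor."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE).astype(np.float32)
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # orientation in [0, pi)
    h, w = img.shape
    feat = []
    for i in range(grid):
        for j in range(grid):
            m = mag[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid]
            a = ang[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid]
            hist, _ = np.histogram(a, bins=bins, range=(0, np.pi), weights=m)
            feat.extend(hist / (m.sum() + 1e-9))  # normalise per block
    return np.asarray(feat)
```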
3.1 Institut EURÉCOM

The EURECOM system for this year employs the approaches used in the K-Space TRECVid 2007 high-level feature extraction task [4][23], applied to 36 semantic concepts of the 2008 test collection, with the goal of incorporating them into the DCU multi-site search system. The EURECOM approach is based on a multi-descriptor system. The descriptors are fed into separate SVM classification systems (one classifier per feature) trained on the first half of the TRECVid 2007 development data set. The classifier outputs are then fused by training a neural network based on evidence theory (NNET) [5] on the second half of the training data set; a minimal sketch of this late-fusion scheme is given below, after Section 3.2. Five runs were submitted using different types of descriptors provided by EURECOM, DCU, JRS and TUB:

1. Run 1: MPEG-7 global descriptors.
2. Run 2: MPEG-7 region descriptors.
3. Run 3: Combination of MPEG-7 global descriptors with the TUB face detector and JRS motion activity descriptors.
4. Run 4: Combination of all descriptors (DCU MPEG-7 global and region, TUB face detector and JRS motion activity descriptors).
5. Run 5: Colour and texture descriptors extracted using three segmentation methods (a fixed image grid, watersheds [21] and a technique based on Minimum Spanning Trees (MST) [8]).

3.2 Institut TELECOM features

We used the same audio features as last year [23]. These features are derived from the outputs of an audio classification system designed to discriminate 17 different classes of sound, namely clean speech, noisy speech, music, music and speech, silence/pause and various environmental sounds (i.e. airplane, helicopter, applause, crowds, dogs, explosion, gun-shot, car, racecar, siren, truck/lorry/bus, motorcycle). The fraction of positive outputs for each class over the length of a video shot is used as the audio feature vector.
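As noted in Section 3.1, here is a minimal sketch of the per-descriptor SVM plus late-fusion training scheme. It substitutes a plain scikit-learn MLP for the evidence-theoretic NNET of [5], whose details are beyond this sketch; the function names, data layouts and hyper-parameters are illustrative assumptions, not the EURECOM implementation.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

def train_fusion(feats_half1, labels_half1, feats_half2, labels_half2):
    """Train one SVM per descriptor on the first half of the development
    data, then train a small fusion network on the held-out second half.

    feats_half1 / feats_half2: dict mapping descriptor name to an
    (n_shots, dim) feature array; labels are binary per-concept annotations.
    NOTE: a plain MLP stands in here for the evidence-theoretic NNET of [5].
    """
    svms = {name: SVC(kernel="rbf", probability=True).fit(X, labels_half1)
            for name, X in feats_half1.items()}
    # Per-descriptor scores on the held-out half become the fusion input.
    scores = np.column_stack(
        [svms[name].predict_proba(feats_half2[name])[:, 1]
         for name in sorted(svms)])
    fusion = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000)
    fusion.fit(scores, labels_half2)
    return svms, fusion

def concept_scores(svms, fusion, feats):
    """Fused concept confidence for new shots, given the same descriptors."""
    scores = np.column_stack(
        [svms[name].predict_proba(feats[name])[:, 1]
         for name in sorted(svms)])
    return fusion.predict_proba(scores)[:, 1]
```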
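The audio feature of Section 3.2 reduces, per shot, to the fraction of positive classifier outputs for each of the 17 sound classes. A minimal sketch, assuming the frame-level binary decisions of the audio classifier (described in [23]) are already available:

```python
import numpy as np

# The 17 sound classes named in Section 3.2.
SOUND_CLASSES = [
    "clean speech", "noisy speech", "music", "music and speech",
    "silence/pause", "airplane", "helicopter", "applause", "crowds",
    "dogs", "explosion", "gun-shot", "car", "racecar", "siren",
    "truck/lorry/bus", "motorcycle"]

def shot_audio_features(decisions):
    """decisions: (n_frames, 17) array of 0/1 classifier outputs over one
    shot. Returns the 17-dim fraction-of-positives feature vector."""
    return np.asarray(decisions).mean(axis=0)
```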
Similar Papers
K-Space at TRECVID 2007
In this paper we describe K-Space's participation in TRECVid 2007. K-Space participated in two tasks, high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept d...
Knowledge Base Retrieval at TRECVID 2008
This paper describes the Knowledge Base multimedia retrieval system for the TRECVID 2008 evaluation. Our focus this year is on query analysis and the creation of a topic knowledge base using external knowledge base information.
LIG and LIRIS at TRECVID 2008: High Level Feature Extraction and Collaborative Annotation
This paper describes the participation of LIG and LIRIS in the TRECVID 2008 High Level Features detection task. We evaluated several fusion strategies, especially rank fusion. Results show that including as many low-level and intermediate features as possible is the best strategy, that SIFT features are very important, that the way in which the fusion from the various low-level and intermed...